Signal-based accent and phrase marking using the Fujisaki model
Abstract
Automatic prosodic marking is important in speech signal processing because its results are required in many subfields, e.g. speech synthesis and speech recognition. At the linguistic level, the most important prosodic features are accents and phrases. In this paper, we develop an automatic algorithm for marking accents and phrases that analyzes the F0 contour using the quantitative Fujisaki model. The automatically extracted accents and phrases were compared with human labeling. The success rates of accent and phrase marking are 77.11% and 67.12%, respectively.
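For readers unfamiliar with the Fujisaki model mentioned in the abstract, the following is a minimal sketch of its standard generative form, in which the logarithm of F0 is a baseline plus phrase and accent components. It shows only the forward model, not the extraction algorithm developed in the paper. The function names and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions and are not taken from the paper.

```python
import math


def phrase_response(t, alpha=2.0):
    """Phrase component: Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent component: Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- baseline frequency in Hz
    phrases -- list of (onset_time, amplitude) phrase commands
    accents -- list of (onset_time, offset_time, amplitude) accent commands
    alpha, beta, gamma -- assumed illustrative constants, not values from the paper
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        rise = accent_response(t - t1, beta, gamma)
        fall = accent_response(t - t2, beta, gamma)
        log_f0 += aa * (rise - fall)
    return math.exp(log_f0)


# Hypothetical commands: one phrase at 0 s, one accent from 0.2 s to 0.5 s.
example_phrases = [(0.0, 0.5)]
example_accents = [(0.2, 0.5, 0.4)]
contour = [fujisaki_f0(0.01 * i, 100.0, example_phrases, example_accents)
           for i in range(150)]
```

Automatic marking works in the opposite direction: given a measured F0 contour, it estimates the command onsets and amplitudes that best reproduce it, and the estimated commands then serve as phrase and accent markers.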
Similar resources
DNN-SPACE: DNN-HMM-Based Generative Model of Voice F0 Contours for Statistical Phrase/Accent Command Estimation
This paper proposes a method to extract prosodic features from a speech signal by leveraging auxiliary linguistic information. A prosodic feature extractor called the statistical phrase/accent command estimation (SPACE) has recently been proposed. This extractor is based on a statistical model formulated as a stochastic counterpart of the Fujisaki model, a well-founded mathematical model represe...
Resynthesis of Prosodic Information Using the Cepstrum Vocoder
The naturalness of synthetic speech depends on automatic extraction of prosodic features and prosody modeling. To improve the naturalness of the synthesized speech, we want to apply the concept of Analysis-by-Synthesis of prosodic information. Therefore, the accents and phrases of the speech signal were extracted using the quantitative Fujisaki model in a recognition model. In a generative mode...
Intonation recognition for Indonesian speech based on Fujisaki model
In this paper, we propose using Fujisaki parameters to distinguish between declarative and interrogative intonation in Indonesian speech. Four combinations of Fujisaki parameters were selected as features for this distinction. The first combination is only the amplitude of the last accent command. The second combination consists of the amplitude of...
A quantitative description of German prosody offering symbolic labels as a by-product
The prosodic quality of a text-to-speech system is important for the intelligibility and perceived naturalness of synthetic speech. In earlier works the author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0. The current paper compares results yielded by automatic Fujisaki modeling with a GToBI-style anotat...
On the Alignment of Prosodic Events
The current study examines the relationship between intonational gestures, as given by the accent commands of the Fujisaki model, and the syllabic grid, using spontaneous American English from the Buckeye Corpus as an example. As an initial step the data were labelled according to American English ToBI conventions. Intensity contours were extracted from the band-filtered speech signal and modelled u...